9 research outputs found

Combining Sentiment Lexica with a Multi-View Variational Autoencoder

Author: Augenstein Isabelle
Cotterell Ryan
Hoyle Alexander
Wallach Hanna
Wolf-Sonkin Lawrence
Publication date: 01/01/2019

When assigning quantitative labels to a dataset, different methodologies may rely on different scales. In particular, when assigning polarities to words in a sentiment lexicon, annotators may use binary, categorical, or continuous labels. Naturally, it is of interest to unify these labels from disparate scales, both to achieve maximal coverage over words and to create a single, more robust sentiment lexicon while retaining scale coherence. We introduce a generative model of sentiment lexica to combine disparate scales into a common latent representation. We realize this model with a novel multi-view variational autoencoder (VAE), called SentiVAE. We evaluate our approach via a downstream text classification task involving nine English-language sentiment analysis datasets; our representation outperforms six individual sentiment lexica, as well as a straightforward combination thereof.

Comment: To appear in NAACL-HLT 201

arXiv.org e-Print Archive

Copenhagen University Research Information System

Combining Sentiment Lexica with a Multi-View Variational Autoencoder

Author: Augenstein Isabelle
Cotterell Ryan
Hoyle Alexander Miserlis
Wallach Hanna
Wolf-Sonkin Lawrence
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2019

Copenhagen University Research Information System

A Structured Variational Autoencoder for Contextual Morphological Inflection

Author: Cotterell Ryan
Mielke Sebastian J
Naradowsky Jason
Wolf-Sonkin Lawrence
Publication venue: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Publication date: 01/01/2018

Apollo (Cambridge)

The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection

Author: Cotterell Ryan
Heinz Jeffrey
Hulden Mans
Kirov Christo
Malaviya Chaitanya
McCarthy Arya D.
Mielke Sabrina J.
Nicolai Garrett
Silfverberg Miikka
Vylomova Ekaterina
Wolf-Sonkin Lawrence
Wu Shijie
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2019

The SIGMORPHON 2019 shared task on cross-lingual transfer and contextual analysis in morphology examined transfer learning of inflection between 100 language pairs, as well as contextual lemmatization and morphosyntactic description in 66 languages. The first task evolves past years' inflection tasks by examining transfer of morphological inflection knowledge from a high-resource language to a low-resource language. This year also presents a new second challenge on lemmatization and morphological feature analysis in context. All submissions featured a neural component and built on either this year's strong baselines or highly ranked systems from previous years' shared tasks. Every participating team improved in accuracy over the baselines for the inflection task (though not in Levenshtein distance), and every team in the contextual analysis task improved on both state-of-the-art neural and non-neural baselines.

Comment: Presented at SIGMORPHON 201

arXiv.org e-Print Archive

Crossref

On the Relationships Between the Grammatical Genders of Inanimate Nouns and Their Co-Occurring Adjectives and Verbs

Author: Blasi Damián
Cotterell Ryan
Wallach Hanna
Williams Adina
Wolf-Sonkin Lawrence
Publication venue: MIT Press - Journals
Publication date: 03/05/2020

We use large-scale corpora in six different gendered languages, along with tools from NLP and information theory, to test whether there is a relationship between the grammatical genders of inanimate nouns and the adjectives used to describe those nouns. For all six languages, we find that there is a statistically significant relationship. We also find that there are statistically significant relationships between the grammatical genders of inanimate nouns and the verbs that take those nouns as direct objects, as indirect objects, and as subjects. We defer deeper investigation of these relationships for future work.

ISSN: 2307-387

arXiv.org e-Print Archive

Repository for Publications and Research Data

On the distribution of deep clausal embeddings: a large cross-linguistic study

Author: Baroni Marco
Bickel Balthasar
Blasi Damian
Cotterell Ryan
Stoll Sabine
Wolf-Sonkin Lawrence
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2019

Paper presented at: 57th Annual Meeting of the Association for Computational Linguistics, held from 28 July to 2 August 2019 in Florence, Italy.

Embedding a clause inside another (“the girl [who likes cars [that run fast]] has arrived”) is a fundamental resource that has been argued to be a key driver of linguistic expressiveness. As such, it plays a central role in fundamental debates on what makes human language unique, and how it might have evolved. Empirical evidence on the prevalence and the limits of embeddings has, however, been based on either laboratory setups or corpus data of relatively limited size. We introduce here a collection of large, dependency-parsed written corpora in 17 languages that allow us, for the first time, to capture clausal embedding through dependency graphs and assess its distribution. Our results indicate that there is no evidence for hard constraints on embedding depth: the tail of depth distributions is heavy. Moreover, although deeply embedded clauses tend to be shorter, suggesting processing load issues, complex sentences with many embeddings do not display a bias towards less deep embeddings. Taken together, the results suggest that deep embeddings are not disfavoured in written language. More generally, our study illustrates how resources and methods from latest-generation big-data NLP can provide new perspectives on fundamental questions in theoretical linguistics.

ZORA

UPF Digital Repository

MPG.PuRe